play style
Style-Preserving Policy Optimization for Game Agents
Li, Lingfeng, Lu, Yunlong, Wang, Yongyi, Li, Wenxin
Proficient game agents with diverse play styles enrich the gaming experience and enhance the replay value of games. However, recent advancements in game AI based on reinforcement learning have predominantly focused on improving proficiency, whereas methods based on evolutionary algorithms generate agents with diverse play styles but exhibit subpar performance compared to RL methods. To address this gap, this paper proposes Mixed Proximal Policy Optimization (MPPO), a method designed to improve the proficiency of existing suboptimal agents while retaining their distinct styles. MPPO unifies the loss objectives for online and offline samples and introduces an implicit constraint that approximates the demonstrator policies by adjusting the empirical distribution of samples. Empirical results across environments of varying scales demonstrate that MPPO achieves proficiency levels comparable to, or even superior to, pure online algorithms while preserving demonstrators' play styles. This work presents an effective approach for generating highly proficient and diverse game agents, ultimately contributing to more engaging gameplay experiences.
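The abstract's core idea of a unified loss over online and offline samples can be sketched as a clipped PPO surrogate applied to a mixed batch, with the empirical distribution reweighted toward demonstrator samples. This is a minimal illustrative sketch; the function name, the `offline_weight` reweighting, and all signatures are assumptions, not the paper's actual formulation.

```python
import numpy as np

def mixed_ppo_loss(log_probs_new, log_probs_old, advantages, is_offline,
                   clip_eps=0.2, offline_weight=0.5):
    """Clipped PPO surrogate over a mixed batch of online rollouts and
    offline demonstrator samples (hypothetical sketch of a unified
    objective in the spirit of MPPO)."""
    ratio = np.exp(log_probs_new - log_probs_old)
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    surrogate = np.minimum(ratio * advantages, clipped * advantages)
    # Reweight the empirical distribution so demonstrator samples
    # implicitly constrain the policy toward the demonstrators' styles.
    weights = np.where(is_offline, offline_weight, 1.0)
    return -np.mean(weights * surrogate)
```

With identical old and new log-probabilities the ratio is 1 everywhere, so the loss reduces to the negative weighted mean advantage.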
Play Style Identification Using Low-Level Representations of Play Traces in MicroRTS
Xia, Ruizhe Yu, Gow, Jeremy, Lucas, Simon
Play style identification can provide valuable game design insights and enable adaptive experiences, with the potential to improve game playing agents. Previous work relies on domain knowledge to construct play trace representations using handcrafted features. More recent approaches incorporate the sequential structure of play traces but still require some level of domain abstraction. In this study, we explore the use of unsupervised CNN-LSTM autoencoder models to obtain latent representations directly from low-level play trace data in MicroRTS. We demonstrate that this approach yields a meaningful separation of different game playing agents in the latent space, reducing reliance on domain expertise and its associated biases. This latent space is then used to guide the exploration of diverse play styles within studied AI players.
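The CNN-LSTM autoencoder described above can be sketched as a 1-D convolution over per-step features followed by an LSTM encoder whose final hidden state is the latent play-trace representation. The layer sizes, names, and decoder structure below are assumptions for illustration, not the study's architecture.

```python
import torch
import torch.nn as nn

class TraceAutoencoder(nn.Module):
    """Unsupervised CNN-LSTM autoencoder for low-level play traces
    (hypothetical sketch; dimensions are illustrative)."""
    def __init__(self, n_features=32, latent_dim=16):
        super().__init__()
        self.conv = nn.Conv1d(n_features, 64, kernel_size=3, padding=1)
        self.enc_lstm = nn.LSTM(64, latent_dim, batch_first=True)
        self.dec_lstm = nn.LSTM(latent_dim, 64, batch_first=True)
        self.out = nn.Linear(64, n_features)

    def forward(self, x):                      # x: (batch, time, features)
        h = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        _, (z, _) = self.enc_lstm(h)           # final hidden state
        latent = z.squeeze(0)                  # (batch, latent_dim)
        # Repeat the latent code at every step and decode the trace back.
        rep = latent.unsqueeze(1).expand(-1, x.size(1), -1)
        d, _ = self.dec_lstm(rep)
        return self.out(d), latent
```

After training on reconstruction loss, the `latent` vectors are what would be clustered to separate different game-playing agents.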
CognitionNet: A Collaborative Neural Network for Play Style Discovery in Online Skill Gaming Platform
Talwadker, Rukma, Chakrabarty, Surajit, Pareek, Aditya, Mukherjee, Tridib, Saini, Deepak
Games are one of the safest sources of realizing self-esteem and relaxation at the same time. An online gaming platform typically has massive data coming in, e.g., in-game actions, player moves, clickstreams, transactions etc. It is rather interesting, as something as simple as data on gaming moves can help create a psychological imprint of the user at that moment, based on her impulsive reactions and response to a situation in the game. Mining this knowledge can: (a) immediately help better explain observed and predicted player behavior; and (b) consequently propel deeper understanding of players' experience, growth and protection. To this effect, we focus on discovery of "game behaviours" as micro-patterns formed by continuous sequences of games, and of the players' persistent "play styles" as a sequence of such sequences, on an online skill gaming platform for Rummy. We propose a two-stage deep neural network, CognitionNet. The first stage focuses on mining game behaviours as cluster representations in a latent space, while the second aggregates over these micro-patterns to discover play styles via a supervised classification objective around player engagement. The dual objective allows CognitionNet to reveal several player-psychology-inspired decision-making patterns and tactics. To our knowledge, this is the first research of its kind to fully automate the discovery of: (i) player psychology and game tactics from telemetry data; and (ii) relevant diagnostic explanations for players' engagement predictions. The collaborative training of the two networks with differential input dimensions is enabled by a novel formulation of "bridge loss". The network plays a pivotal role in obtaining homogeneous and consistent play style definitions and significantly outperforms the SOTA baselines wherever applicable.
A Hierarchical Approach to Population Training for Human-AI Collaboration
Loo, Yi, Gong, Chen, Meghjani, Malika
A major challenge for deep reinforcement learning (DRL) agents is collaborating with novel partners that were not encountered during the training phase. This is exacerbated by increased variance in action responses when DRL agents collaborate with human partners, due to the lack of consistency in human behaviors. Recent work has shown that training a single agent as the best response to a diverse population of training partners significantly increases an agent's robustness to novel partners. We further enhance the population-based training approach by introducing a Hierarchical Reinforcement Learning (HRL) based method for Human-AI Collaboration. Our agent learns multiple best-response policies as its low-level policies while simultaneously learning a high-level policy that acts as a manager, allowing the agent to dynamically switch between the low-level best-response policies based on its current partner. We demonstrate that our method is able to dynamically adapt to novel partners of different play styles and skill levels in the 2-player collaborative Overcooked game environment. We also conducted a human study in the same environment to test the effectiveness of our method when partnering with real human subjects.
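The manager/worker structure described above can be sketched as a high-level policy that picks one of several pre-trained low-level best-response policies per partner. In this illustrative sketch the manager is a simple lookup over inferred partner style; in the paper it is a learned high-level policy. All names here are assumptions.

```python
class HierarchicalAgent:
    """High-level manager switching between low-level best-response
    policies based on the current partner (illustrative sketch)."""
    def __init__(self, low_level_policies):
        self.policies = low_level_policies     # one policy per partner style

    def select_policy(self, partner_features):
        # Stand-in manager: choose the policy matching the partner's
        # most likely style; a learned high-level policy replaces this.
        style = max(partner_features, key=partner_features.get)
        return self.policies.get(style, next(iter(self.policies.values())))

    def act(self, obs, partner_features):
        return self.select_policy(partner_features)(obs)
```

As the partner-style estimate shifts mid-episode, the manager can switch low-level policies on the fly, which is the dynamic adaptation the abstract refers to.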
NBA2Vec: Dense feature representations of NBA players
Guan, Webster, Javed, Nauman, Lu, Peter
Understanding a player's performance in a basketball game requires an evaluation of the player in the context of their teammates and the opposing lineup. Here, we present NBA2Vec, a neural network model based on Word2Vec which extracts dense feature representations of each player by predicting play outcomes without the use of hand-crafted heuristics or aggregate statistical measures. Specifically, our model aimed to predict the outcome of a possession given both the offensive and defensive players on the court. By training on over 3.5 million plays involving 1551 distinct players, our model was able to achieve a 0.3 K-L divergence with respect to the empirical play-by-play distribution. The resulting embedding space is consistent with general classifications of player position and style, and the embedding dimensions correlated at a significant level with traditional box score metrics. Finally, we demonstrate that NBA2Vec accurately predicts the outcomes to various 2017 NBA Playoffs series, and shows potential in determining optimal lineup match-ups. Future applications of NBA2Vec embeddings to characterize players' style may revolutionize predictive models for player acquisition and coaching decisions that maximize team success.
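The Word2Vec-style setup described above can be sketched as: each player gets a dense embedding, and a possession's outcome distribution is predicted from the combined embeddings of the offensive and defensive lineups. The sizes, the sum/difference lineup aggregation, and the single linear output layer below are assumptions for illustration, not the NBA2Vec architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_players, dim, n_outcomes = 10, 8, 5

# Hypothetical sketch: dense per-player embeddings and a linear
# readout to a softmax over possession outcomes.
player_emb = rng.normal(size=(n_players, dim))
W = rng.normal(size=(dim, n_outcomes))

def predict_outcome(offense_ids, defense_ids):
    """Predict a distribution over play outcomes given the on-court
    offensive and defensive lineups (toy untrained model)."""
    context = player_emb[offense_ids].sum(0) - player_emb[defense_ids].sum(0)
    logits = context @ W
    probs = np.exp(logits - logits.max())      # numerically stable softmax
    return probs / probs.sum()
```

Training such a model on observed play-by-play outcomes is what shapes the embedding space so that similar players land near each other.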
Holmgård
In this paper we describe a method of modeling play styles as deviations from approximations of game-theoretically rational actions. These deviations are interpreted as containing information about player skill and player decision-making style. We hypothesize that this information is useful for differentiating between players and for understanding why human player behavior is attributed intentionality, which we argue is a prerequisite for believability. To investigate these hypotheses, we describe an experiment comparing 400 games in the Mario AI Benchmark testbed, played by humans, with equivalent games played by an AI agent acting in an approximately game-theoretically rational manner. The deviations of the player actions from the rational agent's actions are subjected to feature extraction, and the resulting features are used to cluster play sessions into expressions of different play styles. We discuss how these styles differ, and how believable agent behavior might be approached by using these styles as an outset for a planning agent. Finally, we discuss the implications of making assumptions about rational game play and the problematic aspects of inferring player intentions from behavior.
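The pipeline above (compare human actions against a rational agent's, extract features from the deviations, then cluster) can be sketched for one play session. The two features below are illustrative stand-ins; the paper's actual feature set is not reproduced here.

```python
import numpy as np

def deviation_features(human_actions, rational_actions):
    """Summarize a play session by how its actions deviate from an
    approximately rational agent's (illustrative features only)."""
    human = np.asarray(human_actions)
    rational = np.asarray(rational_actions)
    mismatch = human != rational
    return np.array([
        mismatch.mean(),                                     # deviation rate
        np.diff(mismatch.astype(int)).astype(bool).mean(),   # how often the
    ])                                                       # player flips in/out
                                                             # of agreement
```

Feature vectors like these, one per session, would then be fed to a standard clustering algorithm to group sessions into play styles.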
Valls-Vargas
Existing work on player modeling often assumes that the play style of players is static. However, our recent work shows evidence that players regularly change their play style over time. In this paper we propose a novel player modeling framework to capture this change by using episodic information and sequential machine learning techniques. In particular, we experiment with different trace segmentation strategies for play style prediction. We evaluate this new framework on gameplay data gathered from a game-based interactive learning environment. Our results show that sequential machine learning techniques that incorporate predictions from previous segments outperform non-sequential techniques. Our results also show that too fine (minute-by-minute) or too coarse (whole trace) segmentation of traces decreases performance.
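The segmentation-plus-sequential-prediction idea above can be sketched in two steps: split a trace into fixed-length segments, then classify each segment while feeding forward the previous segment's prediction. Both function names and the fixed-length strategy are assumptions; the paper experiments with several segmentation strategies.

```python
def segment_trace(trace, segment_len):
    """Split a play trace into fixed-length segments; the granularity
    knob is what the paper's experiments vary (too fine or too coarse
    both hurt performance)."""
    return [trace[i:i + segment_len]
            for i in range(0, len(trace), segment_len)]

def sequential_predict(segments, classify):
    """Classify each segment using the previous segment's prediction
    as extra context, as in the sequential techniques that the paper
    finds to outperform non-sequential ones."""
    preds, prev = [], None
    for seg in segments:
        prev = classify(seg, prev)
        preds.append(prev)
    return preds
```

Any per-segment classifier that accepts a previous-prediction argument can be dropped into `classify`.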
Reinforcement Learning on Human Decision Models for Uniquely Collaborative AI Teammates
In 2021 the Johns Hopkins University Applied Physics Laboratory held an internal challenge to develop artificially intelligent (AI) agents that could excel at the collaborative card game Hanabi. Agents were evaluated on their ability to play with human players whom the agents had never previously encountered. This study details the development of the agent that won the challenge by achieving a human-play average score of 16.5, outperforming the current state-of-the-art for human-bot Hanabi scores. The winning agent's development consisted of observing and accurately modeling the author's decision making in Hanabi, then training with a behavioral clone of the author. Notably, the agent discovered a human-complementary play style by first mimicking human decision making, then exploring variations on the human-like strategy that led to higher simulated human-bot scores. This work examines in detail the design and implementation of this human-compatible Hanabi teammate, as well as the existence and implications of human-complementary strategies and how they may be explored for more successful applications of AI in human-machine teams.
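The "behavioral clone of the author" step above can be sketched, at its simplest, as estimating an action distribution per observed state from the author's recorded play. This tabular version is a deliberately minimal sketch; the challenge agent used a far richer model of the author's decision making.

```python
import numpy as np

def fit_behavior_clone(states, actions, n_actions):
    """Tabular behavioral clone: estimate the empirical action
    distribution for each state observed in the human's play
    (minimal sketch, not the winning agent's model)."""
    table = {}
    for s, a in zip(states, actions):
        counts = table.setdefault(s, np.zeros(n_actions))
        counts[a] += 1
    return {s: c / c.sum() for s, c in table.items()}
```

An RL agent trained against such a clone as its partner is then optimizing for play that complements the cloned human strategy.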
Instructive artificial intelligence (AI) for human training, assistance, and explainability
Kantack, Nicholas, Cohen, Nina, Bos, Nathan, Lowman, Corey, Everett, James, Endres, Timothy
We propose a novel approach to explainable AI (XAI) based on the concept of "instruction" from neural networks. In this case study, we demonstrate how a superhuman neural network might instruct human trainees as an alternative to traditional approaches to XAI. Specifically, an AI examines human actions and calculates variations on the human strategy that lead to better performance. Experiments with a JHU/APL-developed AI player for the cooperative card game Hanabi suggest this technique makes unique contributions to explainability while improving human performance. One area of focus for Instructive AI is the significant discrepancies that can arise between a human's actual strategy and the strategy they profess to use. This inaccurate self-assessment presents a barrier for XAI, since explanations of an AI's strategy may not be properly understood or implemented by human recipients. We have developed and are testing a novel, Instructive AI approach that estimates human strategy by observing human actions. With neural networks, this allows a direct calculation of the changes in weights needed to improve the human strategy to better emulate a more successful AI. Subjected to constraints (e.g. sparsity), these weight changes can be interpreted as recommended changes to human strategy (e.g. "value A more, and value B less"). Instruction from AI such as this serves both to help humans perform better at tasks and to better understand, anticipate, and correct the actions of an AI. Results will be presented on AI instruction's ability to improve human decision-making and human-AI teaming in Hanabi.
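The sparsity-constrained "recommended changes to human strategy" step described above can be sketched by taking the difference between an estimated human strategy's weights and a stronger AI's weights, keeping only the largest-magnitude changes, and phrasing each as a recommendation. The linear-weights framing and all names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def instruction_from_weights(human_w, expert_w, feature_names, k=2):
    """Turn the k largest weight differences between an estimated human
    strategy and a stronger AI into human-readable advice of the form
    'value A more, value B less' (illustrative sketch)."""
    delta = expert_w - human_w
    top = np.argsort(-np.abs(delta))[:k]      # sparsity: keep top-k changes
    return [f"value {feature_names[i]} {'more' if delta[i] > 0 else 'less'}"
            for i in top]
```

Keeping `k` small is what makes the instruction actionable rather than a full weight dump.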
Identification of Play Styles in Universal Fighting Engine
Yuda, Kaori, Kamei, Shota, Tanji, Riku, Ito, Ryoya, Wakana, Ippo, Mozgovoy, Maxim
[…] in a human-like manner, exhibiting a diversity of play styles and strategies. Thus, the development of fighting game AI requires the ability to evaluate these properties. For instance, it should be possible to ensure that the characters created are believable and diverse. In this paper, we show how an automated procedure can be used to compare play styles of individual AI- and human-controlled characters, and to assess the human-likeness and diversity of game participants. We also compare play styles of people with the style exhibited by a built-in AI system. Finally, we report the results of a short survey aimed to reveal whether human observers can spot "human-like" traits in the behavior of game characters. Universal Fighting Engine (Mind Studios 2021) is a highly customizable platform for one-vs-one fighting games.